[MINOR][CI] Serialize federated monitoring/multitenant tests #2517
Merged
The **.functions.federated.monitoring.**,**.functions.federated.multitenant.** job was the only federated test group still running with the default surefire parallelism (parallel=classes, threadCount=2). Unlike pure-CP tests, these federated tests spawn worker JVMs on fixed ports, run Spark, and share the static /tmp/systemds working directory, so two classes running concurrently in one fork race on those resources. Observed symptoms include "Failed to create non-existing local working directory: /tmp/systemds" and "Federated worker processes on port N died before becoming ready". A leaked worker or Spark thread then keeps the fork JVM alive until the 30-minute job cap cancels the job.

Run this group with -Dtest-threadCount=1 -Dtest-forkCount=1, matching every other federated group (the federated.primitives.part1-5 groups already use this), so the classes execute serially and no longer contend for ports or the shared working directory.
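As a rough illustration of the fixed-port contention described above, the following sketch shows that a second listener cannot bind a port while another listener in the same JVM still holds it. This is not SystemDS code: the class name `FixedPortClash` and its method are hypothetical, and the port is picked by the OS rather than being one of the ports the federated tests actually use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of SystemDS.
final class FixedPortClash {

    /** Returns true if a second listener on an already-bound port is rejected. */
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; from here on it plays the
        // role of the "fixed" port that both test classes would try to use.
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort();
            // Second "test class" tries to start its worker on the same port
            // while the first listener is still open.
            try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                return false; // bind unexpectedly succeeded
            } catch (BindException e) {
                return true; // address already in use
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

With threadCount=1 and forkCount=1 only one test class is active at a time, so the situation modelled by the inner `try` should not arise between classes of this group.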
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2517      +/-   ##
============================================
- Coverage     71.56%   71.54%   -0.03%
+ Complexity    49125    49101      -24
============================================
  Files          1575     1575
  Lines        189784   189784
  Branches      37232    37232
============================================
- Hits         135823   135772      -51
- Misses        43470    43517      +47
- Partials      10491    10495       +4
**.functions.federated.monitoring.**,**.functions.federated.multitenant.** was the only federated test group running at the default surefire parallelism (parallel=classes, threadCount=2). These tests spawn worker JVMs on fixed ports, run Spark, and share the static /tmp/systemds working directory, so two classes per fork race on those resources.

Symptoms
- Failed to create non-existing local working directory: /tmp/systemds
- Federated worker processes on port N died before becoming ready

Change
-Dtest-threadCount=1 -Dtest-forkCount=1, matching every other federated group (the federated.primitives.part1-5 groups already use this), so the classes execute serially and no longer contend for ports and the shared working directory.
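The working-directory symptom is consistent with a check-then-create race. The sketch below replays one such interleaving sequentially, assuming the directory is set up with an `exists()` check followed by `File.mkdirs()`. That pattern is an assumption for illustration, not a quote of the SystemDS implementation, and `WorkDirRace` is a hypothetical class that works in a temporary directory instead of /tmp/systemds.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical demo class, not part of SystemDS.
final class WorkDirRace {

    /** Replays the interleaving "A checks, B creates, A creates" in one thread. */
    static boolean lateMkdirsFails() {
        File base;
        try {
            base = Files.createTempDirectory("workdir-demo").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        File shared = new File(base, "systemds");

        boolean missingForA = !shared.exists(); // class A: directory is not there yet
        boolean createdByB = shared.mkdirs();   // class B: creates it first
        boolean createdByA = shared.mkdirs();   // class A: mkdirs() reports false, directory already exists

        // Clean up the temporary directories created for the demo.
        shared.delete();
        base.delete();

        // A saw "non-existing" and then failed to create: the error condition.
        return missingForA && createdByB && !createdByA;
    }

    public static void main(String[] args) {
        System.out.println("late mkdirs() failed: " + lateMkdirsFails());
    }
}
```

Serial execution removes the second actor from this interleaving, which is why running one class at a time in a single fork is enough to avoid it within this test group.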